Abstract. Some recent approaches for scalable offline partial evaluation
of logic programs include a size-change analysis for ensuring both so-called
local and global termination. In this work, inspired by experimental
evaluation, we introduce several improvements that may increase
the accuracy of the analysis and, thus, the quality of the associated
specialized programs. We aim to achieve this while maintaining the same
complexity and scalability of the recent works.



1 Introduction 

Partial evaluation [1] is a well-known technique for program specialization. In
this work, we consider the so-called offline approach, which consists of two clearly
separated phases: binding-time analysis and specialization proper. Basically, the
binding-time analysis annotates the source code in order to drive the
specialization process. Roughly speaking,

— every atom is annotated as either unfold (the atom can be unfolded) or memo 
(the atom should not be unfolded), and 

— every predicate's argument is classified as either static (definitely known at 
specialization time) or dynamic (possibly unknown at specialization time). 

We say that the annotations are safe if static arguments are actually ground at 
specialization time and the termination of the specialization is ensured. Termi- 
nation issues are usually classified into local and global termination: 

— local termination ensures that no atom is infinitely unfolded; 

— global termination guarantees that only finitely many atoms are specialized 
(i.e., that we do not create infinite specializations of the same predicate). 

The main component of a binding-time analysis is a termination analysis that
allows us to guarantee both local and global termination of the specialization
process. In [5], a strong termination analysis for logic programs, based on the
so-called size-change termination principle [2], is introduced. Strong termination
means termination w.r.t. all selection rules. Although this is a rather strong
condition, it allows us to design a much faster binding-time analysis (see [6]).

In this paper, we identify several weaknesses of the original size-change
analysis of [5] and present different proposals that improve the accuracy of the
specialization process.

2 Size-Change Termination Analysis 

In this section, we informally present the basis of the quasi-termination analysis 
for logic programs of [5]. 

We say that a query $Q$ is strongly terminating w.r.t. a program $P$ if every
SLD derivation for $Q$ with $P$ is finite. We denote by $calls_P^{\mathcal{R}}(Q_0)$ the set of calls
in the computations of a goal $Q_0$ within a logic program $P$ and a computation
rule $\mathcal{R}$. The query $Q$ is strongly quasi-terminating if, for every computation rule
$\mathcal{R}$, the set $calls_P^{\mathcal{R}}(Q)$ contains finitely many nonvariant atoms. A program $P$ is
strongly (quasi-)terminating w.r.t. a set of queries $\mathcal{Q}$ if every $Q \in \mathcal{Q}$ is strongly
(quasi-)terminating w.r.t. $P$. For conciseness, in the remainder of this paper, we
write "(quasi-)termination" to refer to "strong (quasi-)termination."

Size-change analysis is based on constructing graphs that represent the de- 
crease of the arguments of a predicate from one call to another. For this purpose, 
some ordering on terms is required. 

Definition 1 (reduction pair). We say that $(\succsim, \succ)$ is a reduction pair if $\succsim$
is a quasi-order and $\succ$ is a well-founded order where both $\succsim$ and $\succ$ are closed
under substitutions and compatible (i.e., $\succsim \circ \succ \subseteq \succ$ and $\succ \circ \succsim \subseteq \succ$, but $\succsim \subseteq \succ$
is not necessary).

In logic programming, however, termination analyses usually rely on the use of
norms which measure the size of terms. In [5], reduction orders $(\succsim, \succ)$ induced
from symbolic norms $\|\cdot\|$ are used:

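Definition 2 (symbolic norm [3,7]). Given a term $t$,

$$\|t\| = \begin{cases} m + \sum_{i=1}^{n} k_i \|t_i\| & \text{if } t = f(t_1, \ldots, t_n),\ n \geq 0 \\ t & \text{if } t \text{ is a variable} \end{cases}$$

where $m$ and $k_1, \ldots, k_n$ are non-negative integer constants depending only on
$f/n$. Note that we associate a variable over integers with each logical variable
(we use the same name for both since the meaning is clear from the context).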

The introduction of variables in the range of the norm provides a simple mech- 
anism to express dependencies between the sizes of terms. 

The associated induced orders $(\succsim, \succ)$ are defined as follows: $t_1 \succ t_2$ (respec.
$t_1 \succsim t_2$) if $\|t_1\sigma\| > \|t_2\sigma\|$ (respec. $\|t_1\sigma\| \geq \|t_2\sigma\|$) for all substitutions $\sigma$ that
make $\|t_1\sigma\|$ and $\|t_2\sigma\|$ ground (i.e., an integer constant). Two popular instances
of symbolic norms are the symbolic term-size norm $\|\cdot\|_{ts}$ (which sums the arities
of the term symbols) and the symbolic list-length norm $\|\cdot\|_{ll}$ (which counts the
number of elements of a list), e.g.,

$f(X,Y,a) \succsim_{ts} f(X,a,b)$ since $\|f(X,Y,a)\|_{ts} = X + Y + 3 \geq X + 3 = \|f(X,a,b)\|_{ts}$
$[X|R] \succsim_{ll} [s(X)|R]$ since $\|[X|R]\|_{ll} = R + 1 \geq R + 1 = \|[s(X)|R]\|_{ll}$

Fig. 1. Size-change graphs for incList
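The two norms and the induced comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the term encoding (variables as strings, other terms as tuples), the function names `term_size`, `list_length` and `compare`, and the treatment of non-list terms as having list-length 0 are choices made here, not part of the original analysis. A norm is represented as a constant plus a coefficient per variable, and since variables range over non-negative integers, comparing constants and coefficients is enough to decide the two relations for such linear forms.

```python
from collections import Counter

# Assumed encoding: a variable is a string such as "X"; any other term is a
# tuple (functor, arg1, ..., argn); a constant is a 1-tuple such as ("a",).

def is_var(t):
    return isinstance(t, str)

def term_size(t):
    """Symbolic term-size norm as (constant, Counter of variable coefficients)."""
    if is_var(t):
        return 0, Counter({t: 1})
    const, coeffs = len(t) - 1, Counter()   # m = arity of the functor, k_i = 1
    for arg in t[1:]:
        c, vs = term_size(arg)
        const += c
        coeffs.update(vs)
    return const, coeffs

def list_length(t):
    """Symbolic list-length norm; lists are (".", head, tail) and ("[]",)."""
    if is_var(t):
        return 0, Counter({t: 1})
    if t[0] == "." and len(t) == 3:
        c, vs = list_length(t[2])           # only the tail contributes
        return c + 1, vs
    return 0, Counter()                     # [] and non-list terms count as 0

def compare(norm, t1, t2):
    """Return ">" if t1 is strictly larger, ">=" if weakly larger, else None."""
    (c1, v1), (c2, v2) = norm(t1), norm(t2)
    if any(v1[x] < v2[x] for x in v2):
        return None                         # some variable can make t2 larger
    if c1 > c2:
        return ">"
    return ">=" if c1 == c2 else None
```

For example, `compare(term_size, (".", "X", "R"), "R")` reports a strict decrease, because the list constructor adds a positive constant to the size of its tail.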

Now, we produce a size-change graph $\mathcal{G}$ for every pair $(H, B_i)$ of every clause
$H \leftarrow B_1, \ldots, B_n$ of the program, with edges between the arguments of $H$ and
$B_i$ when the size of the corresponding terms decreases w.r.t. a given reduction
pair $(\succsim, \succ)$.

Example 1. Consider the following simple program: 
(c1) incList([], _, []).
(c2) incList([X|R], I, L) <- iList(X, R, I, L).
(c3) iList(X, R, I, [XI|RI]) <- add(I, X, XI), incList(R, I, RI).
(c4) add(0, Y, Y).
(c5) add(s(X), Y, s(Z)) <- add(X, Y, Z).

Let $(\succsim, \succ)$ be the reduction pair induced by the symbolic term-size norm $\|\cdot\|_{ts}$.
Here, we have four size-change graphs, depicted in Fig. 1, which are associated
to clauses c2 (graph $\mathcal{G}_1$), c3 (graphs $\mathcal{G}_2$ and $\mathcal{G}_3$) and c5 (graph $\mathcal{G}_4$).
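As a rough illustration, the four graphs can be rebuilt mechanically from the clauses of Example 1. The sketch below assumes the term-size norm, a tuple encoding of terms (variables as strings), and edge triples `(i, j, rel)` meaning "head argument i is related by rel to argument j of the body atom"; the helper names `size`, `relation`, `graph` and `cons` are hypothetical and chosen only for this example.

```python
from collections import Counter

def size(t):
    """Symbolic term-size norm: (constant, Counter of variable coefficients)."""
    if isinstance(t, str):                      # variables are plain strings
        return 0, Counter({t: 1})
    const, coeffs = len(t) - 1, Counter()
    for arg in t[1:]:
        c, vs = size(arg)
        const, coeffs = const + c, coeffs + vs
    return const, coeffs

def relation(t1, t2):
    """">" or ">=" if t1 is (strictly) at least as large as t2, else None."""
    (c1, v1), (c2, v2) = size(t1), size(t2)
    if c1 < c2 or any(v1[x] < v2[x] for x in v2):
        return None
    return ">" if c1 > c2 else ">="

def graph(head, atom):
    """Size-change graph (source predicate, target predicate, edges)."""
    edges = set()
    for i, s in enumerate(head[1:], 1):
        for j, t in enumerate(atom[1:], 1):
            rel = relation(s, t)
            if rel:
                edges.add((i, j, rel))
    return head[0], atom[0], frozenset(edges)

def cons(h, t):
    return (".", h, t)

# (head, body) for c2, c3 and c5; the facts c1 and c4 yield no graph.
clauses = [
    (("incList", cons("X", "R"), "I", "L"), [("iList", "X", "R", "I", "L")]),
    (("iList", "X", "R", "I", cons("XI", "RI")),
     [("add", "I", "X", "XI"), ("incList", "R", "I", "RI")]),
    (("add", ("s", "X"), "Y", ("s", "Z")), [("add", "X", "Y", "Z")]),
]
graphs = [graph(h, b) for h, body in clauses for b in body]
```

The list `graphs` holds one graph per (head, body atom) pair, in the order c2, c3 (two graphs) and c5.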

In order to identify the program loops, we should compute roughly a transi- 
tive closure of the size-change graphs by composing them in all possible ways. 
Basically, given two size-change graphs: 



$\mathcal{G} = (\{1_p, \ldots, n_p\}, \{1_q, \ldots, m_q\}, E_1)$ and $\mathcal{H} = (\{1_q, \ldots, m_q\}, \{1_r, \ldots, l_r\}, E_2)$

w.r.t. the same reduction pair $(\succsim, \succ)$, their concatenation is defined by

$\mathcal{G} \bullet \mathcal{H} = (\{1_p, \ldots, n_p\}, \{1_r, \ldots, l_r\}, E)$

where $E$ contains an edge from $i_p$ to $k_r$ iff $E_1$ contains an edge from $i_p$ to some
$j_q$ and $E_2$ contains an edge from $j_q$ to $k_r$. Furthermore, if some of the edges are
labeled with $\succ$ then so is the edge in $E$; otherwise, it is labeled with $\succsim$.

In particular, according to [5], we only need to consider the idempotent
size-change graphs $\mathcal{G}$ with $\mathcal{G} \bullet \mathcal{G} = \mathcal{G}$ for analyzing the termination of the program.
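The concatenation and the search for idempotent graphs can be sketched as follows. This is a minimal sketch, not the implementation of the cited analysis: graphs are assumed to be triples `(source predicate, target predicate, edges)` with edges `(i, j, rel)`, the four input graphs were derived by hand from the program of Example 1 under the term-size norm, and the names `compose`, `closure` and `idempotent` are hypothetical.

```python
GRAPHS = [
    ("incList", "iList", frozenset({(1, 1, ">"), (1, 2, ">"), (2, 3, ">="), (3, 4, ">=")})),
    ("iList", "add", frozenset({(3, 1, ">="), (1, 2, ">="), (4, 3, ">")})),
    ("iList", "incList", frozenset({(2, 1, ">="), (3, 2, ">="), (4, 3, ">")})),
    ("add", "add", frozenset({(1, 1, ">"), (2, 2, ">="), (3, 3, ">")})),
]

def compose(g, h):
    """Concatenate g and h; the target of g must be the source of h."""
    (p, q, e1), (q2, r, e2) = g, h
    if q != q2:
        raise ValueError("graphs cannot be concatenated")
    strict = {}
    for (i, j, rel1) in e1:
        for (j2, k, rel2) in e2:
            if j == j2:
                # a path is strict as soon as one of its two edges is strict
                strict[(i, k)] = strict.get((i, k), False) or ">" in (rel1, rel2)
    edges = frozenset((i, k, ">" if s else ">=") for (i, k), s in strict.items())
    return (p, r, edges)

def closure(graphs):
    """All graphs obtainable by concatenating the given graphs."""
    known, frontier = set(graphs), set(graphs)
    while frontier:
        new = set()
        for g in frontier:
            for h in graphs:            # extend on the right by one input graph
                if g[1] == h[0]:
                    c = compose(g, h)
                    if c not in known:
                        new.add(c)
        known |= new
        frontier = new
    return known

def idempotent(graphs):
    """Graphs from a predicate to itself that are unchanged by self-concatenation."""
    return {g for g in graphs if g[0] == g[1] and compose(g, g) == g}

loops = idempotent(closure(GRAPHS))
```

Extending only on the right by the input graphs is enough here because concatenation is associative, so every finite sequence of graphs is reached by a left-to-right fold.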

Example 2. For the program of Example 1, we compute the following idempotent
size-change graphs:

[Figure: idempotent size-change graphs for incList, iList and add]

that represent how the size of the arguments of the three potentially looping
predicates changes from one call to another.

Once the idempotent size-change graphs of a program have been computed, the
following results hold:

Termination: An atom $A$ is (strongly) terminating if every idempotent
size-change graph for $p/n$ contains at least one edge $i_p \xrightarrow{\succ} i_p$ such that, for every
computation rule $\mathcal{R}$ and atom $p(t_1, \ldots, t_n) \in calls_P^{\mathcal{R}}(A)$, the argument $t_i$ is
instantiated enough w.r.t. the considered symbolic norm.
Clearly, the set $calls_P^{\mathcal{R}}(A)$ is often infinite. Therefore, we usually consider an
approximation based on a division that classifies every predicate's argument
as either static or dynamic and check that the $i$-th argument of $p$ is classified
as static (rather than checking that $t_i$ is instantiated enough in all possible
calls from $A$).

For instance, given a division that classifies the arguments of add as follows:

add $\mapsto$ (static, dynamic, dynamic)

and according to the idempotent size-change graphs of Example 2, we have
that all calls to add terminate since there is an edge $1_{add} \xrightarrow{\succ} 1_{add}$ in the
idempotent size-change graph and the first argument of add is classified as
static.

Quasi-termination: An atom $A$ is (strongly) quasi-terminating if it is either
terminating or every idempotent size-change graph for $p/n$ contains, for all
$i_p$ ($i = 1, \ldots, n$), an edge $j_p \xrightarrow{R} i_p$ for some $j_p$, with $R \in \{\succ, \succsim\}$ (i.e.,
all arguments are bounded by the value of some argument in a previous
call). Furthermore, the considered norms must be bounded (see Definition 3
below).

For instance, according to the idempotent size-change graphs of Example 2,
an atom add(X, Y, Z) is quasi-terminating since there is an input edge to
every argument.

3 A term $t$ is instantiated enough [3,7] w.r.t. a symbolic norm $\|\cdot\|$ if $\|t\|$ is an integer
constant.
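Both conditions reduce to simple checks on the idempotent graphs once a division is fixed. The sketch below is an assumption-laden illustration: the graphs in `IDEMPOTENT` were computed by hand for the program of Example 1 under the term-size norm (they are not copied from the paper's figure), edges are triples `(i, j, rel)`, and the function names are hypothetical. The boundedness requirement on the norm is not checked by this code.

```python
# Idempotent size-change graphs per predicate (hand-computed assumption).
IDEMPOTENT = {
    "incList": [{(1, 1, ">"), (2, 2, ">="), (3, 3, ">")}],
    "iList": [{(2, 1, ">"), (2, 2, ">"), (3, 3, ">="), (4, 4, ">")}],
    "add": [{(1, 1, ">"), (2, 2, ">="), (3, 3, ">")}],
}
ARITY = {"incList": 3, "iList": 4, "add": 3}

def terminating(pred, division):
    """Every idempotent graph has a strict edge i -> i on a static argument."""
    return all(
        any(i == j and rel == ">" and division[pred][i - 1] == "static"
            for (i, j, rel) in edges)
        for edges in IDEMPOTENT.get(pred, []))

def quasi_terminating(pred, division):
    """Terminating, or every argument has an input edge in every idempotent graph."""
    if terminating(pred, division):
        return True
    positions = set(range(1, ARITY[pred] + 1))
    return all({j for (_, j, _) in edges} >= positions
               for edges in IDEMPOTENT.get(pred, []))

division = {
    "incList": ("dynamic", "static", "dynamic"),
    "iList": ("dynamic", "dynamic", "static", "dynamic"),
    "add": ("static", "dynamic", "dynamic"),
}
```

With this division, only `add` passes the termination check, since its strict self-edge sits on the first (static) argument, while all three predicates pass the quasi-termination check.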

In [5], the termination condition is used for ensuring the local termination of
partial evaluation, while the quasi-termination condition is used for ensuring its
global termination. Basically,

— we reclassify as unfold those atoms which are terminating w.r.t. a given 
division (and with memo otherwise) and 

— we mark with dynamic the argument of an atom if there is no input edge to 
this argument in some idempotent size-change graph, i.e., if the atom is not 
quasi-terminating. 

Example 3. Given the idempotent size-change graphs of Example 2 and a
division that classifies the predicates' arguments as follows:

incList $\mapsto$ (dynamic, static, dynamic)
iList $\mapsto$ (dynamic, dynamic, static, dynamic)
add $\mapsto$ (static, dynamic, dynamic)

we have that 

— incList and iList are marked with memo while add is marked with unfold, 
and 

— no argument should be re-classified as dynamic. 
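The two annotation steps can be combined into one pass over the predicates. The following is a sketch under assumptions: the idempotent graphs are supplied by the caller as sets of `(i, j, rel)` edges (here hand-computed for Example 1), the function name `annotate` is hypothetical, and the pass is not iterated, although reclassifying an argument as dynamic could in principle invalidate an earlier unfold decision. The predicate `r` is an invented extra case that shows a reclassification.

```python
def annotate(idempotent, division):
    """Single pass: returns (unfold/memo per predicate, updated division)."""
    annotations, new_division = {}, {}
    for pred, classes in division.items():
        graphs = idempotent.get(pred, [])
        # local termination: a strict self-edge on a static argument in every graph
        terminating = all(
            any(i == j and rel == ">" and classes[i - 1] == "static"
                for (i, j, rel) in g)
            for g in graphs)
        annotations[pred] = "unfold" if terminating else "memo"
        updated = list(classes)
        if not terminating:
            # global termination: arguments without an input edge become dynamic
            for g in graphs:
                targets = {j for (_, j, _) in g}
                for pos in range(1, len(classes) + 1):
                    if pos not in targets:
                        updated[pos - 1] = "dynamic"
        new_division[pred] = tuple(updated)
    return annotations, new_division

idempotent_graphs = {
    "incList": [{(1, 1, ">"), (2, 2, ">="), (3, 3, ">")}],
    "iList": [{(2, 1, ">"), (2, 2, ">"), (3, 3, ">="), (4, 4, ">")}],
    "add": [{(1, 1, ">"), (2, 2, ">="), (3, 3, ">")}],
    "r": [{(1, 1, ">=")}],              # hypothetical predicate, not in Example 1
}
division = {
    "incList": ("dynamic", "static", "dynamic"),
    "iList": ("dynamic", "dynamic", "static", "dynamic"),
    "add": ("static", "dynamic", "dynamic"),
    "r": ("static", "static"),
}
annotations, new_division = annotate(idempotent_graphs, division)
```

For the three predicates of Example 1 the division is returned unchanged, whereas the second argument of the invented predicate `r` has no input edge and is therefore marked dynamic.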
3 Improving Size-Change Analysis 

In this section, we introduce several extensions of the size-change analysis that 
may improve the accuracy of the specialization process by taking into account 
some basic properties of partial evaluation. 

3.1 Non-Bounded Norms for Global Termination 

Let us recall the notion of bounded norm required in [5] for ensuring quasi- 
termination: 

Definition 3 (bounded norm). We say that a symbolic norm $\|\cdot\|$ is bounded
if the set $\{s \mid \|t\| \geq \|s\|\}$ contains a finite number of nonvariant terms for any
term $t$.

Roughly speaking, a symbolic norm is bounded if, for every term $t$, there exist
only finitely many nonvariant terms whose weights are less than or equal to
that of $t$ w.r.t. the symbolic norm $\|\cdot\|$.

Unfortunately, many symbolic norms are not bounded; e.g., the symbolic
list-length norm is not bounded since, given the term p([a]), we have an infinite
set {p([a]), p([f(a)]), p([f(f(a))]), ...} of nonvariant terms such that
$\|[a]\|_{ll} = \|[f(a)]\|_{ll} = \|[f(f(a))]\|_{ll} = \cdots = 1$.
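A quick way to see the problem is to enumerate a few members of that set and measure them. The snippet is only an illustration: terms are encoded as tuples (an assumption of this sketch), `list_length` handles ground proper lists only, and `nest` is a hypothetical helper that wraps the constant a in k applications of f.

```python
def list_length(t):
    """List-length of a ground proper list built from (".", head, tail) and ("[]",)."""
    n = 0
    while isinstance(t, tuple) and len(t) == 3 and t[0] == ".":
        n, t = n + 1, t[2]
    return n

def nest(k):
    """The term f(f(...f(a)...)) with k occurrences of f."""
    t = ("a",)
    for _ in range(k):
        t = ("f", t)
    return t

# [a], [f(a)], [f(f(a))], ...: pairwise distinct terms of list-length 1
singletons = [(".", nest(k), ("[]",)) for k in range(5)]
lengths = [list_length(t) for t in singletons]
```

Every element of `singletons` is a different term, yet all of them have the same list-length, and the same construction works for any k.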

In the context of partial evaluation, however, symbolic norms need not be
bounded if the problematic parts of the terms are generalized at the global
level. For instance, we can safely use the symbolic list-length norm as long as
the list elements are replaced by fresh variables in the global level. This idea,
already sketched in [5], is formalized by means of the most general generalization
operator:

Definition 4 (mgg). Let $\|\cdot\|$ be a symbolic norm. Given a term $t$, we denote
by $mgg_{\|\cdot\|}(t)$ the most general generalization of $t$ such that $\|t\| = \|mgg_{\|\cdot\|}(t)\|$.
We also let $mgg_{\|\cdot\|}(p(t_1, \ldots, t_n)) = p(mgg_{\|\cdot\|}(t_1), \ldots, mgg_{\|\cdot\|}(t_n))$.

For instance, given the term $t = [s(N), b]$, we have $mgg_{\|\cdot\|_{ll}}(t) = [X, Y]$ but
$mgg_{\|\cdot\|_{ts}}(t) = [s(N), b]$.
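One way to compute such a generalization, assuming the norm is given by its constants $m$ and $k_1, \ldots, k_n$ per function symbol, is to replace every argument whose coefficient is 0 by a fresh variable and to recurse into the remaining arguments. The sketch below follows that idea; the tuple encoding of terms, the names `mgg`, `mgg_atom`, `term_size_weights` and `list_length_weights`, and the `_G0, _G1, ...` naming of fresh variables (assumed not to clash with existing ones) are all assumptions of this example.

```python
import itertools

def term_size_weights(functor, arity):
    """(m, (k_1, ..., k_n)) for the term-size norm: m = arity, all k_i = 1."""
    return arity, (1,) * arity

def list_length_weights(functor, arity):
    """(m, (k_1, ..., k_n)) for the list-length norm: only list tails count."""
    if (functor, arity) == (".", 2):
        return 1, (0, 1)
    return 0, (0,) * arity

def mgg(t, weights, fresh=None):
    """Generalize t while preserving its symbolic norm under `weights`."""
    if fresh is None:
        fresh = (f"_G{i}" for i in itertools.count())
    if isinstance(t, str):                  # a variable is already most general
        return t
    _, ks = weights(t[0], len(t) - 1)
    args = [mgg(a, weights, fresh) if k > 0 else next(fresh)
            for a, k in zip(t[1:], ks)]     # arguments ignored by the norm are dropped
    return (t[0], *args)

def mgg_atom(atom, weights):
    """Apply mgg to each argument of an atom p(t1, ..., tn)."""
    fresh = (f"_G{i}" for i in itertools.count())
    return (atom[0], *(mgg(a, weights, fresh) for a in atom[1:]))

# t = [s(N), b]
t = (".", ("s", "N"), (".", ("b",), ("[]",)))
```

Here `mgg(t, list_length_weights)` yields a two-element list of fresh variables, while `mgg(t, term_size_weights)` returns `t` unchanged, since every subterm contributes to the term-size norm.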

Moreover, the quasi-termination result in [5] also requires that all calls en- 
countered during partial evaluation should be linear w.r.t. the dynamic variables 
(i.e., no variable marked as dynamic could appear more than once in a call). 
However, this is not a real problem in the context of partial evaluation since all 
dynamic parts of terms are replaced by fresh variables in the global level anyway. 

Therefore, one can ensure the global termination of partial evaluation when 
using arbitrary symbolic norms in the size-change analysis as long as 

— dynamic parts of arguments are replaced by fresh variables in the global level 
(this is already done by current offline partial evaluators) and 

— an atom $A$ is replaced by $mgg_{\|\cdot\|}(A)$ in the global level, where $\|\cdot\|$ is the
symbolic norm used in the size-change analysis.

3.2 Maximizing "Unfold" Annotations 

The original approach of [5] does not take into account that different idempotent 
size-change graphs may represent a single loop. For instance, the idempotent size- 
change graphs for both incList and iList actually represent the same program 
loop. Therefore, it would be safe to annotate only one of these predicates with 
"memo" and the other one with "unfold". 

In order to avoid unnecessary memo annotations, one can slightly extend the 
original annotation procedure as follows: 

— First, every size-change graph is labeled with a unique identifier (e.g., $\mathcal{G}_1$,
$\mathcal{G}_2$, ..., as in Fig. 1).

— Then, the concatenation of graphs is performed as before, but now every 
concatenation keeps a set with the identifiers of the graphs involved in the 
concatenation. We note that the set of identifiers is not taken into account 
during the concatenation process, i.e., two size-change graphs that only differ 
in the associated set of identifiers are considered equal (therefore, the com- 
plexity of the concatenation process, the most expensive part of the analysis, 
remains the same). 

For instance, the labeled idempotent size-change graphs of Example 2 would
now be as depicted in Fig. 2.

Fig. 2. Labeled idempotent size-change graphs for incList



— The computed idempotent size-change graphs can now be grouped into 
equivalence classes so that two idempotent size-change graphs belong to the 
same class if they are labeled with the same set of identifiers. 

— Finally, we should only annotate with "memo" one predicate for every equiv- 
alence class of idempotent size-change graphs. 

For instance, as mentioned in Example 3, both incList and iList are marked
with memo in the original framework. Now, however, only one of them would be
marked with memo (and the other one with unfold).

Clearly, there is a degree of freedom when choosing which is the idempotent 
size-change graph of a given class that should be marked with memo. For this 
purpose, one can define appropriate heuristics that minimize the number of 
memo annotations by, e.g., assigning a higher priority to those predicates that 
belong to more than one class. 
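The labeling scheme can be sketched by carrying a set of identifiers next to each graph during the closure computation. This is an illustrative sketch only: the input graphs are hand-derived from Example 1 under the term-size norm, graphs are triples `(source, target, edges)`, and the names `labeled_closure` and `loop_classes` are hypothetical. As in the description above, the identifier sets play no role when deciding whether a graph is new.

```python
GRAPHS = {
    1: ("incList", "iList", frozenset({(1, 1, ">"), (1, 2, ">"), (2, 3, ">="), (3, 4, ">=")})),
    2: ("iList", "add", frozenset({(3, 1, ">="), (1, 2, ">="), (4, 3, ">")})),
    3: ("iList", "incList", frozenset({(2, 1, ">="), (3, 2, ">="), (4, 3, ">")})),
    4: ("add", "add", frozenset({(1, 1, ">"), (2, 2, ">="), (3, 3, ">")})),
}

def compose(g, h):
    """Concatenation of two graphs; strictness wins along a path."""
    strict = {}
    for (i, j, rel1) in g[2]:
        for (j2, k, rel2) in h[2]:
            if j == j2:
                strict[(i, k)] = strict.get((i, k), False) or ">" in (rel1, rel2)
    return (g[0], h[1], frozenset((i, k, ">" if s else ">=") for (i, k), s in strict.items()))

def labeled_closure(graphs):
    """Map every graph in the closure to the identifiers of the graphs that built it."""
    labels = {g: frozenset({ident}) for ident, g in graphs.items()}
    frontier = list(labels)
    while frontier:
        nxt = []
        for g in frontier:
            for ident, h in graphs.items():
                if g[1] == h[0]:
                    c = compose(g, h)
                    if c not in labels:     # identifiers are ignored in this test
                        labels[c] = labels[g] | {ident}
                        nxt.append(c)
        frontier = nxt
    return labels

def loop_classes(labels):
    """Group the predicates of idempotent graphs by their identifier set."""
    classes = {}
    for g, idents in labels.items():
        if g[0] == g[1] and compose(g, g) == g:
            classes.setdefault(idents, set()).add(g[0])
    return classes

classes = loop_classes(labeled_closure(GRAPHS))
```

For this program the idempotent graphs of incList and iList end up in the same class, so a single memo annotation suffices for that loop; which predicate receives it is left to a heuristic.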

3.3 Right-Propagation of Bindings 

An advantage of the size-change analysis of [8] is that it is independent of a 
particular selection rule. As mentioned in the introduction, this property makes 
the associated binding-time analysis much faster; unfortunately, it is also less 
accurate. 

In some cases, we can improve this situation by assuming some partial
knowledge of the evaluation order.⁴ For instance, we could first run a left-termination
analysis (like, e.g., the one based on the binary unfoldings [2]) or rely on user
annotations that identify some atoms as "completely unfoldable" (note that an
annotation unfold only means that the atom can be unfolded one step; then the
annotations of the predicates in the unfolded goal should be followed).

In this case, we can improve the accuracy of the size-change analysis by using
an inter-argument size analysis like that calculated from the convex hull of [1].
For instance, given the program

p(X) <- q(X,Y), p(Y).
q(s(0),0).
q(s(X),Y) <- q(X,Y).

the size-change graph associated to p/1 originally contains no edge (since we
do not know the size relation between X and Y). Now, if we assume that q/2
is completely unfoldable, then we can use the output of the convex hull of [1]
(using a term-size norm):

q(A, B) <- {A > B, B = 0, A ≥ 1}

for propagating some additional constraints to the right of q. In this way, one can
easily infer that the size-change graph for p/1 should contain an edge $1_p \xrightarrow{\succ} 1_p$.

4 We thank Maurice Bruynooghe for suggesting this improvement.
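The effect of right-propagation on this example can be mimicked with a small lookup of size relations. The sketch is deliberately simplistic and rests on several assumptions: atoms have only variables as arguments, the inter-argument relation for q/2 is supplied by hand as "argument 1 is strictly larger than argument 2" (it is not computed by a convex hull analysis here), no constraint solving or transitivity is performed, and the names `propagate` and `edges` are hypothetical.

```python
# Assumed inter-argument size relation: after q(A, B) succeeds, size(A) > size(B).
SIZE_RELATIONS = {("q", 2): [(1, ">", 2)]}

def propagate(body_prefix, relations):
    """Size constraints contributed by atoms assumed to be completely unfolded."""
    constraints = set()
    for atom in body_prefix:
        for (i, rel, j) in relations.get((atom[0], len(atom) - 1), []):
            constraints.add((atom[i], rel, atom[j]))
    return constraints

def edges(head, body, index, relations):
    """Edges from the head arguments to the arguments of body[index]."""
    known = propagate(body[:index], relations)   # atoms to the left of body[index]
    result = set()
    for i, s in enumerate(head[1:], 1):
        for j, t in enumerate(body[index][1:], 1):
            if s == t:
                result.add((i, j, ">="))
            for rel in (">", ">="):
                if (s, rel, t) in known:
                    result.add((i, j, rel))
    return result

# p(X) <- q(X, Y), p(Y).
head, body = ("p", "X"), [("q", "X", "Y"), ("p", "Y")]
with_propagation = edges(head, body, 1, SIZE_RELATIONS)
without_propagation = edges(head, body, 1, {})
```

Without the relation for q/2 the graph for the recursive call has no edge; with it, the first argument of p is known to decrease strictly.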

Let us note that, in principle, the accuracy of the size-change analysis of
[8] could not be improved by adding inter-argument size relations to size-change
graphs, since inter-argument relations usually require the atoms to be completely
unfolded (i.e., they represent relations that hold for success patterns). This
assumption is not generally true in the setting of [5], where partial evaluations are
possible.

4 Discussion 

We have recently undertaken the implementation of a binding-time analysis for 
the offline partial evaluation of Prolog programs which is based on the size- 
change analysis of [8]. In this paper, we have introduced several improvements 
that may allow us to overcome the main weaknesses of [8]. An experimental 
evaluation will be conducted in order to assess their effectiveness in practice.